Overview

Dataset statistics

Number of variables: 15
Number of observations: 732
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 15
Duplicate rows (%): 2.0%
Total size in memory: 642.0 KiB
Average record size in memory: 898.0 B

Variable types

Categorical: 9
Numeric: 6

Alerts

Dataset has 15 (2.0%) duplicate rows (Duplicates)
pr_user has a high cardinality: 225 distinct values (High cardinality)
rev_user has a high cardinality: 218 distinct values (High cardinality)
Belege is highly correlated with Rechtschreibung (High correlation)
Formatierung is highly correlated with Sprache (High correlation)
Strukturierung des Beitrags is highly correlated with Rechtschreibung and 4 other fields (High correlation)
Rechtschreibung is highly correlated with Belege and 5 other fields (High correlation)
Sprache is highly correlated with Formatierung and 5 other fields (High correlation)
Inhalt is highly correlated with Strukturierung des Beitrags and 4 other fields (High correlation)
Einordnung in den Kontext is highly correlated with Strukturierung des Beitrags and 4 other fields (High correlation)
Ansprechend für Zielgruppe is highly correlated with Strukturierung des Beitrags and 4 other fields (High correlation)
pr_user is uniformly distributed (Uniform)
rev_user is uniformly distributed (Uniform)

Reproduction

Analysis started: 2021-12-09 15:42:48.771010
Analysis finished: 2021-12-09 15:42:57.895304
Duration: 9.12 seconds
Software version: pandas-profiling v3.1.1
Download configuration: config.json

Variables

pr_user
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 225
Distinct (%): 30.7%
Missing: 0
Missing (%): 0.0%
Memory size: 47.4 KiB
fa97fyka: 8
Ne88peni: 8
mirela08: 7
Steffi226: 7
LaHe27: 7
Other values (220): 695

Length

Max length: 19
Median length: 8
Mean length: 9.087431694
Min length: 4

Characters and Unicode

Total characters: 6652
Distinct characters: 63
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
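
The character properties mentioned above can be inspected with Python's standard `unicodedata` module. The following is a minimal sketch, not the report generator's own code; the four user names are taken from tables elsewhere in this report and stand in for the full column.

```python
import unicodedata
from collections import Counter

# A handful of user names that appear in this report; the real column
# holds 732 values.
values = ["AnnaSophieNi", "Gilchus", "Christoph-Mantsch", "eddq2000"]

# Map every character to its two-letter Unicode general category:
# Ll = lowercase letter, Lu = uppercase letter,
# Nd = decimal number,   Pd = dash punctuation.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

for category, count in category_counts.most_common():
    print(category, count)
```

These four general categories correspond to the "Lowercase Letter", "Uppercase Letter", "Decimal Number" and "Dash Punctuation" rows reported for this variable.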

Unique

Unique: 6
Unique (%): 0.8%

Sample

1st row: AnnaSophieNi
2nd row: AnnaSophieNi
3rd row: Gilchus
4th row: TommiMueller
5th row: TommiMueller

Common Values

Value | Count | Frequency (%)
fa97fyka | 8 | 1.1%
Ne88peni | 8 | 1.1%
mirela08 | 7 | 1.0%
Steffi226 | 7 | 1.0%
LaHe27 | 7 | 1.0%
jsk0lb | 6 | 0.8%
zo54hoko | 6 | 0.8%
5SY5 | 6 | 0.8%
AnjaKlostermeier | 6 | 0.8%
ig27oqaf | 6 | 0.8%
Other values (215) | 665 | 90.8%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
fa97fyka | 8 | 1.1%
ne88peni | 8 | 1.1%
mirela08 | 7 | 1.0%
steffi226 | 7 | 1.0%
lahe27 | 7 | 1.0%
alexandermueller296 | 6 | 0.8%
za22regi | 6 | 0.8%
moritzbock | 6 | 0.8%
lianalia | 6 | 0.8%
fabrigh | 6 | 0.8%
Other values (215) | 665 | 90.8%

Most occurring characters

Value | Count | Frequency (%)
a | 637 | 9.6%
e | 551 | 8.3%
i | 462 | 6.9%
n | 386 | 5.8%
r | 361 | 5.4%
l | 298 | 4.5%
o | 248 | 3.7%
s | 231 | 3.5%
u | 227 | 3.4%
t | 190 | 2.9%
Other values (53) | 3061 | 46.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 4955 | 74.5%
Decimal Number | 949 | 14.3%
Uppercase Letter | 720 | 10.8%
Dash Punctuation | 28 | 0.4%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 637 | 12.9%
e | 551 | 11.1%
i | 462 | 9.3%
n | 386 | 7.8%
r | 361 | 7.3%
l | 298 | 6.0%
o | 248 | 5.0%
s | 231 | 4.7%
u | 227 | 4.6%
t | 190 | 3.8%
Other values (16) | 1364 | 27.5%

Uppercase Letter
Value | Count | Frequency (%)
M | 73 | 10.1%
L | 71 | 9.9%
A | 67 | 9.3%
S | 65 | 9.0%
B | 35 | 4.9%
H | 34 | 4.7%
K | 33 | 4.6%
C | 33 | 4.6%
F | 32 | 4.4%
J | 32 | 4.4%
Other values (16) | 245 | 34.0%

Decimal Number
Value | Count | Frequency (%)
2 | 181 | 19.1%
0 | 165 | 17.4%
1 | 143 | 15.1%
9 | 101 | 10.6%
4 | 81 | 8.5%
8 | 76 | 8.0%
7 | 64 | 6.7%
6 | 53 | 5.6%
3 | 51 | 5.4%
5 | 34 | 3.6%

Dash Punctuation
Value | Count | Frequency (%)
- | 28 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 5675 | 85.3%
Common | 977 | 14.7%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 637 | 11.2%
e | 551 | 9.7%
i | 462 | 8.1%
n | 386 | 6.8%
r | 361 | 6.4%
l | 298 | 5.3%
o | 248 | 4.4%
s | 231 | 4.1%
u | 227 | 4.0%
t | 190 | 3.3%
Other values (42) | 2084 | 36.7%

Common
Value | Count | Frequency (%)
2 | 181 | 18.5%
0 | 165 | 16.9%
1 | 143 | 14.6%
9 | 101 | 10.3%
4 | 81 | 8.3%
8 | 76 | 7.8%
7 | 64 | 6.6%
6 | 53 | 5.4%
3 | 51 | 5.2%
5 | 34 | 3.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6652 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 637 | 9.6%
e | 551 | 8.3%
i | 462 | 6.9%
n | 386 | 5.8%
r | 361 | 5.4%
l | 298 | 4.5%
o | 248 | 3.7%
s | 231 | 3.5%
u | 227 | 3.4%
t | 190 | 2.9%
Other values (53) | 3061 | 46.0%

rev_user
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 218
Distinct (%): 29.8%
Missing: 0
Missing (%): 0.0%
Memory size: 47.4 KiB
Teemoma: 9
xyily: 6
LKmps2021: 6
nicolasrmg: 6
LeahMtmb: 6
Other values (213): 699

Length

Max length: 19
Median length: 8
Mean length: 9.081967213
Min length: 4

Characters and Unicode

Total characters: 6648
Distinct characters: 63
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): 0.5%

Sample

1st row: TommiMueller
2nd row: Gilchus
3rd row: TommiMueller
4th row: ChrisBohl
5th row: biancamg

Common Values

Value | Count | Frequency (%)
Teemoma | 9 | 1.2%
xyily | 6 | 0.8%
LKmps2021 | 6 | 0.8%
nicolasrmg | 6 | 0.8%
LeahMtmb | 6 | 0.8%
Kisara2426 | 6 | 0.8%
Christoph-Mantsch | 6 | 0.8%
Ersan42 | 6 | 0.8%
Maximilian219 | 5 | 0.7%
DeniseWt | 5 | 0.7%
Other values (208) | 671 | 91.7%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
teemoma | 9 | 1.2%
lkmps2021 | 6 | 0.8%
nicolasrmg | 6 | 0.8%
leahmtmb | 6 | 0.8%
kisara2426 | 6 | 0.8%
christoph-mantsch | 6 | 0.8%
ersan42 | 6 | 0.8%
xyily | 6 | 0.8%
alexandermueller296 | 5 | 0.7%
christian-anghel | 5 | 0.7%
Other values (208) | 671 | 91.7%

Most occurring characters

Value | Count | Frequency (%)
a | 618 | 9.3%
e | 526 | 7.9%
i | 468 | 7.0%
n | 381 | 5.7%
r | 353 | 5.3%
l | 314 | 4.7%
s | 256 | 3.9%
o | 253 | 3.8%
u | 247 | 3.7%
h | 192 | 2.9%
Other values (53) | 3040 | 45.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 4969 | 74.7%
Decimal Number | 928 | 14.0%
Uppercase Letter | 716 | 10.8%
Dash Punctuation | 35 | 0.5%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 618 | 12.4%
e | 526 | 10.6%
i | 468 | 9.4%
n | 381 | 7.7%
r | 353 | 7.1%
l | 314 | 6.3%
s | 256 | 5.2%
o | 253 | 5.1%
u | 247 | 5.0%
h | 192 | 3.9%
Other values (16) | 1361 | 27.4%

Uppercase Letter
Value | Count | Frequency (%)
L | 76 | 10.6%
A | 75 | 10.5%
M | 73 | 10.2%
S | 51 | 7.1%
F | 40 | 5.6%
K | 35 | 4.9%
T | 34 | 4.7%
B | 32 | 4.5%
C | 30 | 4.2%
J | 28 | 3.9%
Other values (16) | 242 | 33.8%

Decimal Number
Value | Count | Frequency (%)
2 | 179 | 19.3%
1 | 164 | 17.7%
0 | 159 | 17.1%
9 | 97 | 10.5%
4 | 86 | 9.3%
8 | 64 | 6.9%
7 | 53 | 5.7%
3 | 47 | 5.1%
6 | 46 | 5.0%
5 | 33 | 3.6%

Dash Punctuation
Value | Count | Frequency (%)
- | 35 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 5685 | 85.5%
Common | 963 | 14.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 618 | 10.9%
e | 526 | 9.3%
i | 468 | 8.2%
n | 381 | 6.7%
r | 353 | 6.2%
l | 314 | 5.5%
s | 256 | 4.5%
o | 253 | 4.5%
u | 247 | 4.3%
h | 192 | 3.4%
Other values (42) | 2077 | 36.5%

Common
Value | Count | Frequency (%)
2 | 179 | 18.6%
1 | 164 | 17.0%
0 | 159 | 16.5%
9 | 97 | 10.1%
4 | 86 | 8.9%
8 | 64 | 6.6%
7 | 53 | 5.5%
3 | 47 | 4.9%
6 | 46 | 4.8%
- | 35 | 3.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6648 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 618 | 9.3%
e | 526 | 7.9%
i | 468 | 7.0%
n | 381 | 5.7%
r | 353 | 5.3%
l | 314 | 4.7%
s | 256 | 3.9%
o | 253 | 3.8%
u | 247 | 3.7%
h | 192 | 2.9%
Other values (53) | 3040 | 45.7%

Front-Matter
Categorical

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB
ja: 674
nein: 58

Length

Max length: 4
Median length: 2
Mean length: 2.158469945
Min length: 2

Characters and Unicode

Total characters: 1580
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ja
2nd row: ja
3rd row: ja
4th row: ja
5th row: ja

Common Values

Value | Count | Frequency (%)
ja | 674 | 92.1%
nein | 58 | 7.9%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
ja | 674 | 92.1%
nein | 58 | 7.9%

Most occurring characters

Value | Count | Frequency (%)
j | 674 | 42.7%
a | 674 | 42.7%
n | 116 | 7.3%
e | 58 | 3.7%
i | 58 | 3.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1580 | 100.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
j | 674 | 42.7%
a | 674 | 42.7%
n | 116 | 7.3%
e | 58 | 3.7%
i | 58 | 3.7%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1580 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
j | 674 | 42.7%
a | 674 | 42.7%
n | 116 | 7.3%
e | 58 | 3.7%
i | 58 | 3.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1580 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
j | 674 | 42.7%
a | 674 | 42.7%
n | 116 | 7.3%
e | 58 | 3.7%
i | 58 | 3.7%

Umfang
Categorical

Distinct: 3
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 42.5 KiB
ja: 614
nein: 117
Nein: 1

Length

Max length: 4
Median length: 2
Mean length: 2.322404372
Min length: 2

Characters and Unicode

Total characters: 1700
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: ja
2nd row: ja
3rd row: ja
4th row: ja
5th row: ja

Common Values

Value | Count | Frequency (%)
ja | 614 | 83.9%
nein | 117 | 16.0%
Nein | 1 | 0.1%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
ja | 614 | 83.9%
nein | 118 | 16.1%
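
The second value table for Umfang reports 118 for nein, where Common Values lists nein 117 and Nein 1. That is consistent with the values being lowercased before counting. The following is a minimal sketch of such a merge, assuming a plain case-insensitive fold; the report itself does not state how the table was derived.

```python
from collections import Counter

# Raw counts for Umfang as listed under "Common Values".
raw_counts = Counter({"ja": 614, "nein": 117, "Nein": 1})

# Fold case before counting so that "Nein" and "nein" share one bucket.
folded_counts = Counter()
for value, count in raw_counts.items():
    folded_counts[value.lower()] += count

total = sum(folded_counts.values())
for value, count in folded_counts.most_common():
    print(f"{value}: {count} ({count / total:.1%})")
```

The same fold would merge the single Ja into ja for Belege (662 + 1 = 663).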

Most occurring characters

Value | Count | Frequency (%)
j | 614 | 36.1%
a | 614 | 36.1%
n | 235 | 13.8%
e | 118 | 6.9%
i | 118 | 6.9%
N | 1 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1699 | 99.9%
Uppercase Letter | 1 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
j | 614 | 36.1%
a | 614 | 36.1%
n | 235 | 13.8%
e | 118 | 6.9%
i | 118 | 6.9%

Uppercase Letter
Value | Count | Frequency (%)
N | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1700 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
j | 614 | 36.1%
a | 614 | 36.1%
n | 235 | 13.8%
e | 118 | 6.9%
i | 118 | 6.9%
N | 1 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1700 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
j | 614 | 36.1%
a | 614 | 36.1%
n | 235 | 13.8%
e | 118 | 6.9%
i | 118 | 6.9%
N | 1 | 0.1%

Belege
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB
ja: 662
nein: 69
Ja: 1

Length

Max length: 4
Median length: 2
Mean length: 2.18852459
Min length: 2

Characters and Unicode

Total characters: 1602
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: ja
2nd row: ja
3rd row: nein
4th row: ja
5th row: ja

Common Values

Value | Count | Frequency (%)
ja | 662 | 90.4%
nein | 69 | 9.4%
Ja | 1 | 0.1%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
ja | 663 | 90.6%
nein | 69 | 9.4%

Most occurring characters

Value | Count | Frequency (%)
a | 663 | 41.4%
j | 662 | 41.3%
n | 138 | 8.6%
e | 69 | 4.3%
i | 69 | 4.3%
J | 1 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1601 | 99.9%
Uppercase Letter | 1 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 663 | 41.4%
j | 662 | 41.3%
n | 138 | 8.6%
e | 69 | 4.3%
i | 69 | 4.3%

Uppercase Letter
Value | Count | Frequency (%)
J | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1602 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 663 | 41.4%
j | 662 | 41.3%
n | 138 | 8.6%
e | 69 | 4.3%
i | 69 | 4.3%
J | 1 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1602 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 663 | 41.4%
j | 662 | 41.3%
n | 138 | 8.6%
e | 69 | 4.3%
i | 69 | 4.3%
J | 1 | 0.1%

Links
Categorical

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 42.5 KiB
ja: 646
nein: 86

Length

Max length: 4
Median length: 2
Mean length: 2.234972678
Min length: 2

Characters and Unicode

Total characters: 1636
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ja
2nd row: ja
3rd row: nein
4th row: ja
5th row: ja

Common Values

Value | Count | Frequency (%)
ja | 646 | 88.3%
nein | 86 | 11.7%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
ja | 646 | 88.3%
nein | 86 | 11.7%

Most occurring characters

Value | Count | Frequency (%)
j | 646 | 39.5%
a | 646 | 39.5%
n | 172 | 10.5%
e | 86 | 5.3%
i | 86 | 5.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1636 | 100.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
j | 646 | 39.5%
a | 646 | 39.5%
n | 172 | 10.5%
e | 86 | 5.3%
i | 86 | 5.3%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1636 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
j | 646 | 39.5%
a | 646 | 39.5%
n | 172 | 10.5%
e | 86 | 5.3%
i | 86 | 5.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1636 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
j | 646 | 39.5%
a | 646 | 39.5%
n | 172 | 10.5%
e | 86 | 5.3%
i | 86 | 5.3%

Formatierung
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB
ja: 676
nein: 56

Length

Max length: 4
Median length: 2
Mean length: 2.153005464
Min length: 2

Characters and Unicode

Total characters: 1576
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ja
2nd row: ja
3rd row: nein
4th row: ja
5th row: ja

Common Values

Value | Count | Frequency (%)
ja | 676 | 92.3%
nein | 56 | 7.7%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
ja | 676 | 92.3%
nein | 56 | 7.7%

Most occurring characters

Value | Count | Frequency (%)
j | 676 | 42.9%
a | 676 | 42.9%
n | 112 | 7.1%
e | 56 | 3.6%
i | 56 | 3.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1576 | 100.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
j | 676 | 42.9%
a | 676 | 42.9%
n | 112 | 7.1%
e | 56 | 3.6%
i | 56 | 3.6%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1576 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
j | 676 | 42.9%
a | 676 | 42.9%
n | 112 | 7.1%
e | 56 | 3.6%
i | 56 | 3.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1576 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
j | 676 | 42.9%
a | 676 | 42.9%
n | 112 | 7.1%
e | 56 | 3.6%
i | 56 | 3.6%

Verlinkungen vorhanden
Categorical

Distinct: 3
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 42.6 KiB
ja: 578
nein: 153
Nein: 1

Length

Max length: 4
Median length: 2
Mean length: 2.420765027
Min length: 2

Characters and Unicode

Total characters: 1772
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: nein
2nd row: nein
3rd row: nein
4th row: ja
5th row: ja

Common Values

Value | Count | Frequency (%)
ja | 578 | 79.0%
nein | 153 | 20.9%
Nein | 1 | 0.1%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
ja | 578 | 79.0%
nein | 154 | 21.0%

Most occurring characters

Value | Count | Frequency (%)
j | 578 | 32.6%
a | 578 | 32.6%
n | 307 | 17.3%
e | 154 | 8.7%
i | 154 | 8.7%
N | 1 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1771 | 99.9%
Uppercase Letter | 1 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
j | 578 | 32.6%
a | 578 | 32.6%
n | 307 | 17.3%
e | 154 | 8.7%
i | 154 | 8.7%

Uppercase Letter
Value | Count | Frequency (%)
N | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1772 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
j | 578 | 32.6%
a | 578 | 32.6%
n | 307 | 17.3%
e | 154 | 8.7%
i | 154 | 8.7%
N | 1 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1772 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
j | 578 | 32.6%
a | 578 | 32.6%
n | 307 | 17.3%
e | 154 | 8.7%
i | 154 | 8.7%
N | 1 | 0.1%

Abbildungen
Categorical

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 42.5 KiB
ja: 653
nein: 79

Length

Max length: 4
Median length: 2
Mean length: 2.215846995
Min length: 2

Characters and Unicode

Total characters: 1622
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ja
2nd row: ja
3rd row: ja
4th row: ja
5th row: ja

Common Values

Value | Count | Frequency (%)
ja | 653 | 89.2%
nein | 79 | 10.8%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
ja | 653 | 89.2%
nein | 79 | 10.8%

Most occurring characters

Value | Count | Frequency (%)
j | 653 | 40.3%
a | 653 | 40.3%
n | 158 | 9.7%
e | 79 | 4.9%
i | 79 | 4.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1622 | 100.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
j | 653 | 40.3%
a | 653 | 40.3%
n | 158 | 9.7%
e | 79 | 4.9%
i | 79 | 4.9%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1622 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
j | 653 | 40.3%
a | 653 | 40.3%
n | 158 | 9.7%
e | 79 | 4.9%
i | 79 | 4.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1622 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
j | 653 | 40.3%
a | 653 | 40.3%
n | 158 | 9.7%
e | 79 | 4.9%
i | 79 | 4.9%

Strukturierung des Beitrags
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 9
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.021857923
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6.55
Q1: 8
Median: 10
Q3: 10
95-th percentile: 10
Maximum: 10
Range: 9
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.468152693
Coefficient of variation (CV): 0.1627328545
Kurtosis: 8.007199061
Mean: 9.021857923
Median Absolute Deviation (MAD): 0
Skewness: -2.399388663
Sum: 6604
Variance: 2.15547233
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

Value | Count | Frequency (%)
10 | 388 | 53.0%
9 | 153 | 20.9%
8 | 104 | 14.2%
7 | 50 | 6.8%
6 | 16 | 2.2%
5 | 7 | 1.0%
1 | 6 | 0.8%
4 | 4 | 0.5%
3 | 4 | 0.5%

Value | Count | Frequency (%)
1 | 6 | 0.8%
3 | 4 | 0.5%
4 | 4 | 0.5%
5 | 7 | 1.0%
6 | 16 | 2.2%
7 | 50 | 6.8%
8 | 104 | 14.2%
9 | 153 | 20.9%
10 | 388 | 53.0%

Value | Count | Frequency (%)
10 | 388 | 53.0%
9 | 153 | 20.9%
8 | 104 | 14.2%
7 | 50 | 6.8%
6 | 16 | 2.2%
5 | 7 | 1.0%
4 | 4 | 0.5%
3 | 4 | 0.5%
1 | 6 | 0.8%
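
The descriptive statistics for Strukturierung des Beitrags can be reproduced from its value counts. The sketch below assumes the sample variance (denominator n - 1), which is the pandas default; it covers only the sum, mean, variance, standard deviation and coefficient of variation.

```python
import math

# Value -> count for "Strukturierung des Beitrags", copied from this report.
counts = {10: 388, 9: 153, 8: 104, 7: 50, 6: 16, 5: 7, 4: 4, 3: 4, 1: 6}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(
    count * (value - mean) ** 2 for value, count in counts.items()
) / (n - 1)
std = math.sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

print(f"n={n} sum={total} mean={mean:.9f}")
print(f"variance={variance:.8f} std={std:.9f} cv={cv:.10f}")
```

The results agree with the reported Sum (6604), Mean (9.021857923), Variance (2.15547233), Standard deviation (1.468152693) and CV (0.1627328545).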

Rechtschreibung
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 9
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.857923497
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 8
Median: 9
Q3: 10
95-th percentile: 10
Maximum: 10
Range: 9
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.428278044
Coefficient of variation (CV): 0.1612429871
Kurtosis: 5.612720964
Mean: 8.857923497
Median Absolute Deviation (MAD): 1
Skewness: -1.946409586
Sum: 6484
Variance: 2.039978172
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

Value | Count | Frequency (%)
10 | 313 | 42.8%
9 | 193 | 26.4%
8 | 132 | 18.0%
7 | 46 | 6.3%
6 | 24 | 3.3%
5 | 13 | 1.8%
4 | 5 | 0.7%
1 | 4 | 0.5%
3 | 2 | 0.3%

Value | Count | Frequency (%)
1 | 4 | 0.5%
3 | 2 | 0.3%
4 | 5 | 0.7%
5 | 13 | 1.8%
6 | 24 | 3.3%
7 | 46 | 6.3%
8 | 132 | 18.0%
9 | 193 | 26.4%
10 | 313 | 42.8%

Value | Count | Frequency (%)
10 | 313 | 42.8%
9 | 193 | 26.4%
8 | 132 | 18.0%
7 | 46 | 6.3%
6 | 24 | 3.3%
5 | 13 | 1.8%
4 | 5 | 0.7%
3 | 2 | 0.3%
1 | 4 | 0.5%

Sprache
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 10
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.728142077
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 8
Median: 9
Q3: 10
95-th percentile: 10
Maximum: 10
Range: 9
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.510066665
Coefficient of variation (CV): 0.1730112378
Kurtosis: 4.301550033
Mean: 8.728142077
Median Absolute Deviation (MAD): 1
Skewness: -1.730826413
Sum: 6389
Variance: 2.280301331
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
10 | 292 | 39.9%
9 | 189 | 25.8%
8 | 123 | 16.8%
7 | 74 | 10.1%
6 | 26 | 3.6%
5 | 14 | 1.9%
4 | 7 | 1.0%
1 | 4 | 0.5%
3 | 2 | 0.3%
2 | 1 | 0.1%

Value | Count | Frequency (%)
1 | 4 | 0.5%
2 | 1 | 0.1%
3 | 2 | 0.3%
4 | 7 | 1.0%
5 | 14 | 1.9%
6 | 26 | 3.6%
7 | 74 | 10.1%
8 | 123 | 16.8%
9 | 189 | 25.8%
10 | 292 | 39.9%

Value | Count | Frequency (%)
10 | 292 | 39.9%
9 | 189 | 25.8%
8 | 123 | 16.8%
7 | 74 | 10.1%
6 | 26 | 3.6%
5 | 14 | 1.9%
4 | 7 | 1.0%
3 | 2 | 0.3%
2 | 1 | 0.1%
1 | 4 | 0.5%

Inhalt
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 9
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.095628415
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 7
Q1: 9
Median: 10
Q3: 10
95-th percentile: 10
Maximum: 10
Range: 9
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.328064951
Coefficient of variation (CV): 0.1460113463
Kurtosis: 11.31776017
Mean: 9.095628415
Median Absolute Deviation (MAD): 0
Skewness: -2.70585198
Sum: 6658
Variance: 1.763756513
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

Value | Count | Frequency (%)
10 | 375 | 51.2%
9 | 181 | 24.7%
8 | 119 | 16.3%
7 | 27 | 3.7%
6 | 16 | 2.2%
1 | 6 | 0.8%
5 | 5 | 0.7%
4 | 2 | 0.3%
3 | 1 | 0.1%

Value | Count | Frequency (%)
1 | 6 | 0.8%
3 | 1 | 0.1%
4 | 2 | 0.3%
5 | 5 | 0.7%
6 | 16 | 2.2%
7 | 27 | 3.7%
8 | 119 | 16.3%
9 | 181 | 24.7%
10 | 375 | 51.2%

Value | Count | Frequency (%)
10 | 375 | 51.2%
9 | 181 | 24.7%
8 | 119 | 16.3%
7 | 27 | 3.7%
6 | 16 | 2.2%
5 | 5 | 0.7%
4 | 2 | 0.3%
3 | 1 | 0.1%
1 | 6 | 0.8%

Einordnung in den Kontext
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 10
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.855191257
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 8
Median: 10
Q3: 10
95-th percentile: 10
Maximum: 10
Range: 9
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.760895445
Coefficient of variation (CV): 0.198854592
Kurtosis: 4.876495717
Mean: 8.855191257
Median Absolute Deviation (MAD): 0
Skewness: -2.069898795
Sum: 6482
Variance: 3.100752768
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
10 | 401 | 54.8%
9 | 120 | 16.4%
8 | 88 | 12.0%
7 | 47 | 6.4%
6 | 34 | 4.6%
5 | 24 | 3.3%
1 | 8 | 1.1%
2 | 5 | 0.7%
3 | 3 | 0.4%
4 | 2 | 0.3%

Value | Count | Frequency (%)
1 | 8 | 1.1%
2 | 5 | 0.7%
3 | 3 | 0.4%
4 | 2 | 0.3%
5 | 24 | 3.3%
6 | 34 | 4.6%
7 | 47 | 6.4%
8 | 88 | 12.0%
9 | 120 | 16.4%
10 | 401 | 54.8%

Value | Count | Frequency (%)
10 | 401 | 54.8%
9 | 120 | 16.4%
8 | 88 | 12.0%
7 | 47 | 6.4%
6 | 34 | 4.6%
5 | 24 | 3.3%
4 | 2 | 0.3%
3 | 3 | 0.4%
2 | 5 | 0.7%
1 | 8 | 1.1%

Ansprechend für Zielgruppe
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 10
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.139344262
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6.55
Q1: 9
Median: 10
Q3: 10
95-th percentile: 10
Maximum: 10
Range: 9
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.470078627
Coefficient of variation (CV): 0.1608516525
Kurtosis: 9.40792256
Mean: 9.139344262
Median Absolute Deviation (MAD): 0
Skewness: -2.708206264
Sum: 6690
Variance: 2.16113117
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
10 | 430 | 58.7%
9 | 149 | 20.4%
8 | 78 | 10.7%
7 | 38 | 5.2%
5 | 13 | 1.8%
6 | 11 | 1.5%
1 | 6 | 0.8%
3 | 4 | 0.5%
4 | 2 | 0.3%
2 | 1 | 0.1%

Value | Count | Frequency (%)
1 | 6 | 0.8%
2 | 1 | 0.1%
3 | 4 | 0.5%
4 | 2 | 0.3%
5 | 13 | 1.8%
6 | 11 | 1.5%
7 | 38 | 5.2%
8 | 78 | 10.7%
9 | 149 | 20.4%
10 | 430 | 58.7%

Value | Count | Frequency (%)
10 | 430 | 58.7%
9 | 149 | 20.4%
8 | 78 | 10.7%
7 | 38 | 5.2%
6 | 11 | 1.5%
5 | 13 | 1.8%
4 | 2 | 0.3%
3 | 4 | 0.5%
2 | 1 | 0.1%
1 | 6 | 0.8%

Interactions

[Figures: interaction plots of the numeric variables (36 panels)]

Correlations


Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
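
A minimal sketch of the bias-corrected estimator, written with the standard library only. The function name `cramers_v_corrected` is illustrative and not part of pandas-profiling, and returning 0.0 for degenerate inputs is a convention of this sketch, not necessarily what the report generator does.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V (Bergsma, 2013) for two equal-length sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)
    if n < 2 or r < 2 or k < 2:
        return 0.0  # convention of this sketch for degenerate input

    # Pearson chi-squared statistic from observed vs. expected cell counts.
    chi2 = 0.0
    for row, row_n in row_totals.items():
        for col, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (joint[(row, col)] - expected) ** 2 / expected

    # Bergsma's correction shrinks phi^2 and the table dimensions.
    phi2_corr = max(0.0, chi2 / n - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # every value distinct; same convention as above
    return math.sqrt(phi2_corr / denom)


identical = ["ja", "nein"] * 50
print(cramers_v_corrected(identical, identical))
```

Two identical ja/nein columns give a value of 1 (up to floating-point rounding), while two columns whose joint counts equal the product of their marginals give 0.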

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

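Index | pr_user | rev_user | Front-Matter | Umfang | Belege | Links | Formatierung | Verlinkungen vorhanden | Abbildungen | Strukturierung des Beitrags | Rechtschreibung | Sprache | Inhalt | Einordnung in den Kontext | Ansprechend für Zielgruppe
0 | AnnaSophieNi | TommiMueller | ja | ja | ja | ja | ja | nein | ja | 10 | 8 | 10 | 10 | 5 | 10
1 | AnnaSophieNi | Gilchus | ja | ja | ja | ja | ja | nein | ja | 10 | 9 | 10 | 10 | 7 | 10
2 | Gilchus | TommiMueller | ja | ja | nein | nein | nein | nein | ja | 7 | 8 | 9 | 10 | 5 | 7
3 | TommiMueller | ChrisBohl | ja | ja | ja | ja | ja | ja | ja | 9 | 10 | 10 | 10 | 10 | 9
4 | TommiMueller | biancamg | ja | ja | ja | ja | ja | ja | ja | 8 | 10 | 10 | 10 | 10 | 10
5 | eddq2000 | ChrisBohl | ja | ja | ja | ja | ja | ja | ja | 10 | 10 | 10 | 8 | 8 | 8
6 | eddq2000 | ChristinaHartung | ja | nein | ja | ja | ja | ja | ja | 9 | 10 | 10 | 8 | 10 | 10
7 | eddq2000 | biancamg | ja | nein | ja | ja | ja | ja | ja | 10 | 10 | 10 | 8 | 8 | 10
8 | ChrisBohl | vegas1337 | nein | ja | ja | ja | nein | ja | ja | 9 | 10 | 10 | 9 | 8 | 10
9 | ChrisBohl | ChristinaHartung | nein | ja | ja | ja | nein | ja | ja | 10 | 10 | 9 | 9 | 8 | 10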

Last rows

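Index | pr_user | rev_user | Front-Matter | Umfang | Belege | Links | Formatierung | Verlinkungen vorhanden | Abbildungen | Strukturierung des Beitrags | Rechtschreibung | Sprache | Inhalt | Einordnung in den Kontext | Ansprechend für Zielgruppe
722 | hello-pukeko | Leon1906 | ja | ja | ja | ja | ja | ja | ja | 9 | 9 | 8 | 10 | 9 | 10
723 | fadiarabo | Gilchus | ja | nein | ja | ja | ja | ja | ja | 9 | 8 | 8 | 8 | 7 | 9
724 | fadiarabo | Gilchus | ja | ja | ja | ja | ja | ja | ja | 10 | 9 | 9 | 10 | 7 | 10
725 | fadiarabo | AnnaSophieNi | ja | nein | nein | nein | ja | nein | ja | 5 | 7 | 6 | 4 | 5 | 5
726 | fadiarabo | Leon1906 | ja | ja | ja | ja | ja | ja | ja | 7 | 7 | 7 | 7 | 6 | 7
727 | Leon1906 | TommiMueller | ja | ja | nein | ja | ja | nein | ja | 8 | 9 | 8 | 7 | 5 | 7
728 | Leon1906 | Gilchus | ja | ja | ja | ja | ja | nein | ja | 9 | 10 | 9 | 7 | 9 | 10
729 | Leon1906 | Gilchus | ja | ja | ja | ja | ja | nein | ja | 9 | 10 | 10 | 9 | 7 | 10
730 | Leon1906 | TommiMueller | ja | ja | nein | ja | ja | nein | ja | 10 | 7 | 6 | 8 | 5 | 7
731 | Leon1906 | AnnaSophieNi | ja | ja | ja | nein | ja | nein | ja | 7 | 8 | 8 | 8 | 6 | 9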

Duplicate rows

Most frequently occurring

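Index | pr_user | rev_user | Front-Matter | Umfang | Belege | Links | Formatierung | Verlinkungen vorhanden | Abbildungen | Strukturierung des Beitrags | Rechtschreibung | Sprache | Inhalt | Einordnung in den Kontext | Ansprechend für Zielgruppe | # duplicates
0 | GEDA9263 | Magnus-schn | ja | ja | ja | ja | ja | ja | ja | 10 | 10 | 10 | 10 | 10 | 10 | 2
1 | LaHe27 | LeahMtmb | ja | ja | ja | ja | ja | ja | ja | 10 | 9 | 8 | 10 | 10 | 9 | 2
2 | Lukas1401 | nicolasrmg | ja | ja | ja | ja | ja | ja | ja | 10 | 10 | 10 | 10 | 10 | 10 | 2
3 | Ne88peni | Teemoma | ja | ja | ja | ja | ja | nein | ja | 8 | 9 | 9 | 9 | 9 | 9 | 2
4 | Ne88peni | Teemoma | ja | ja | ja | ja | ja | nein | nein | 9 | 9 | 10 | 9 | 10 | 9 | 2
5 | YeldaUzun | xyily | ja | ja | ja | ja | ja | ja | ja | 10 | 9 | 10 | 10 | 10 | 10 | 2
6 | fa97fyka | Ersan42 | ja | ja | ja | ja | ja | nein | ja | 9 | 8 | 10 | 10 | 9 | 9 | 2
7 | fa97fyka | Ersan42 | ja | ja | ja | ja | ja | nein | nein | 7 | 9 | 9 | 9 | 10 | 8 | 2
8 | fe94fiqy | xyily | ja | nein | nein | ja | nein | nein | nein | 3 | 4 | 4 | 8 | 5 | 7 | 2
9 | ksushaW | Meyu23 | ja | ja | ja | ja | ja | nein | nein | 10 | 10 | 10 | 9 | 8 | 9 | 2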
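
A "# duplicates" column of this kind can be derived by counting identical rows. The sketch below uses hypothetical rows (the user names and scores are made up, not taken from the dataset) and shows two counts one might report; how the report generator arrives at its own figure of 15 duplicate rows is not shown here.

```python
from collections import Counter

# Hypothetical rows (pr_user, rev_user, score); not taken from the dataset.
rows = [
    ("user_a", "user_b", 10),
    ("user_a", "user_b", 10),
    ("user_a", "user_c", 9),
    ("user_d", "user_b", 8),
    ("user_d", "user_b", 8),
    ("user_d", "user_b", 8),
]

row_counts = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicates = [(row, count) for row, count in row_counts.most_common() if count > 1]

# Rows that repeat an earlier row: each group contributes count - 1.
n_repeats = sum(count - 1 for _, count in duplicates)

for row, count in duplicates:
    print(row, count)
print("distinct duplicated rows:", len(duplicates), "repeats:", n_repeats)
```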